Using Cellular Automata for Improving KNN Based Spam Filtering
Authors
Abstract
With the rapid growth of the Internet, electronic mail (e-mail) has become a popular communication tool. However, junk mail, also known as spam, has increasingly become a part of life for users as well as Internet service providers. To address this problem, many solutions have been proposed in the last decade. Content-based anti-spam filtering is currently an important line of work; it treats spam filtering as a special case of binary text categorization. Many machine learning techniques have been developed and applied to classify e-mail as spam or non-spam. In this paper, we propose an enhanced K-Nearest Neighbours (KNN) method called Cellular Automaton Combined with KNN (CA-KNN) for spam filtering. In the proposed method, a cellular automaton identifies which instances in the training set should be used to classify a new e-mail; CA-KNN selects the nearest neighbours not from the whole training set, but only from a reduced subset selected by the cellular automaton.
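The abstract does not state the automaton's transition rule, so the following is only a minimal sketch of the general idea (select a reduced training subset with a cellular automaton, then run KNN on that subset). Everything specific here is an assumption made for illustration, not the authors' method: each training instance is treated as a cell, a cell's neighbourhood is its `radius` nearest training instances, a cell stays active only while at least half of its active neighbours share its label, and Euclidean distance is used. The names `ca_select`, `ca_knn_predict`, `TRAIN_X` and `TRAIN_Y` are hypothetical.

```python
import math
from collections import Counter


def nearest_indices(vectors, query, k, exclude=None):
    """Indices of the k vectors closest to query (Euclidean distance)."""
    candidates = [i for i in range(len(vectors)) if i != exclude]
    candidates.sort(key=lambda i: math.dist(vectors[i], query))
    return candidates[:k]


def ca_select(vectors, labels, radius=3, steps=2):
    """Assumed CA rule: a cell survives a step only if at least half of its
    active neighbours carry the same label. Updates are synchronous."""
    n = len(vectors)
    neighbourhoods = [
        nearest_indices(vectors, vectors[i], radius, exclude=i) for i in range(n)
    ]
    active = [True] * n
    for _ in range(steps):
        next_active = []
        for i in range(n):
            live = [j for j in neighbourhoods[i] if active[j]]
            agree = sum(labels[j] == labels[i] for j in live)
            next_active.append(active[i] and (not live or 2 * agree >= len(live)))
        active = next_active
    return [i for i in range(n) if active[i]]


def ca_knn_predict(vectors, labels, query, k=3, radius=3, steps=2):
    """Majority vote among the k nearest instances of the CA-selected subset."""
    kept = ca_select(vectors, labels, radius, steps)
    if not kept:  # fall back to the full training set if every cell died
        kept = list(range(len(vectors)))
    subset = [vectors[i] for i in kept]
    votes = Counter(labels[kept[i]] for i in nearest_indices(subset, query, k))
    return votes.most_common(1)[0][0]


# Toy two-feature data (made up): the last instance is a mislabelled e-mail
# sitting inside the spam cluster.
TRAIN_X = [(5, 0), (4, 1), (5, 1), (6, 1), (0, 5), (1, 4), (1, 5), (1, 6), (5, 0.5)]
TRAIN_Y = ["spam"] * 4 + ["ham"] * 4 + ["ham"]

print(ca_select(TRAIN_X, TRAIN_Y))
print(ca_knn_predict(TRAIN_X, TRAIN_Y, (5, 0.6), k=1))
```

On this toy data the assumed rule deactivates the mislabelled instance, so a 1-nearest-neighbour vote for a query near it is taken from the remaining spam instances rather than from the noisy one. A different transition rule or neighbourhood definition would select a different subset.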
Similar resources
A Novel Method for Detecting Spam Email using KNN Classification with Spearman Correlation as Distance Measure
E-mail is one of the most prevalent methods of correspondence because of its availability, quick message exchange and low sending cost. Spam has become a serious issue affecting this application on today's Internet. Spam may contain suspicious URLs, or may ask for financial information such as money exchange information or credit card details. Here comes the scope of filtering spam from legitimate em...
Non-Parametric Spam Filtering based on kNN and LSA
The paper proposes a non-parametric approach to filtering unsolicited commercial e-mail messages, also known as spam. The text of each e-mail message is represented as an LSA vector, which is then fed into a kNN classifier. The method shows high accuracy on a collection of recent personal e-mail messages. Tests on the standard LINGSPAM collection achieve an accuracy of over 99.65%, which is an impr...
York University at TREC 2005: SPAM Track
We propose a variant of the k-nearest neighbor classification method, called instance-weighted k-nearest neighbor method, for adaptive spam filtering. The method assigns two weights, distance weight and correctness weight, to a training instance, and makes use of the two weights when classifying a new email. The correctness weight is also used in the maintenance of the training data to make the...
A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors
Because cyberspace and the Internet predominate in users' lives, they bring, alongside business opportunities and time savings, threats such as information theft and system penetration in both hardware and software. Security is the top priority in preventing cyber-attacks, and users should first detect the type of attack, because virtual environments are not moni...
A New Model for Email Spam Detection using Hybrid of Magnetic Optimization Algorithm with Harmony Search Algorithm
Unfortunately, users of Internet services face many unwanted messages that are unrelated to their interests and that contain advertising or even malicious content. Spam comprises a huge collection of infected and malicious advertising e-mails that cause harm by destroying data and stealing personal information for malicious purposes. In most cases, spam emails con...
Journal title:
- Int. Arab J. Inf. Technol.
Volume 11, Issue
Pages -
Publication date: 2014